1. Introduction

In this paper we are interested in the following question: given a finite
measure $\mu$, at what speed can it be approximated by finitely supported
measures? To make sense of the question, one needs a distance on
the space of measures; we shall use the Wasserstein distances $W_p$, with
arbitrary exponent $p \in [1, +\infty)$ (definitions are recalled in Section 2).

This problem has been called quantization for probability distributions;
the case of exponent $p = 1$ has also been studied under the name of
location problem, and the case $p = 2$ is linked with optimal centroidal
Voronoi tessellations. After submission of the present article, we became
aware that previous works cover much more of the material presented
than we first thought; see Subsection 1.2 for detailed references.

This problem could be of interest for a numerical study of transporta- 
tion problems, where measures can be represented by discrete ones. One 
would need to know the number of points needed to achieve some preci- 
sion in the approximation. 

We shall restrict our attention to compactly supported Borel measures
on Riemannian manifolds.



1.1. Statement of the results. -- First we show that the order of
convergence is determined by the dimension of the measure (see definitions
in Section 2); $\Delta_N$ denotes the set of all measures supported in at
most $N$ points.

Theorem 1.1. -- If $\mu$ is compactly supported and Ahlfors regular of
dimension $s > 0$, then

\[ W_p(\mu, \Delta_N) \approx \frac{1}{N^{1/s}}. \]

Here we write $\approx$ to say that one quantity is bounded above and below
by positive multiples of the other. Examples of Ahlfors regular measures
are given by the volume measures on submanifolds, and self-similar
measures (see for example [7]). Theorem 1.1, to be proved in a slightly
more general and precise form in Section 4, is simple and unsurprising;
it is reminiscent of ball packing and covering, and indeed relies on a
standard covering argument.

In the particular case of absolutely continuous measures, one can give
much finer estimates. First, it is easily seen that if $\square^d$ denotes the uniform
measure on a Euclidean unit cube of dimension $d$, then there is a constant
$\theta(d,p)$ such that

\[ W_p(\square^d, \Delta_N) \sim \frac{\theta(d,p)}{N^{1/d}} \]

(Proposition 5.3). Note that determining the precise value of $\theta(d,p)$
seems difficult; known cases are discussed in Section 1.2.3.

The main result of this paper is the following, where "vol" denotes 
the volume measure on the considered Riemannian manifold and is the 
default measure for all integrals. 

Theorem 1.2. -- If $\mu = \rho\,\mathrm{vol}$ where $\rho$ is a compactly supported function
on a Riemannian manifold $(M,g)$, then for all $1 \leq p < \infty$ we have

\[ W_p(\mu, \Delta_N) \sim \frac{\theta(d,p)\,|\rho|_{d/(d+p)}^{1/p}}{N^{1/d}} \tag{1} \]

where $|\rho|_\beta = (\int_M \rho^\beta)^{1/\beta}$ is the $L^\beta$ "norm", here with $\beta < 1$ though.

Moreover, if $(\mu_N)$ is a sequence of finitely supported measures such
that $\mu_N \in \Delta_N$ minimizes $W_p(\mu, \mu_N)$, then the sequence of probability
measures $(\bar\mu_N)$ that are uniform on the support of $\mu_N$ converges weakly
to the multiple of $\rho^{d/(d+p)}$ that has mass 1.

Theorem 1.2 is proved in Section 5. Note that the hypothesis that
$\rho$ has compact support is obviously needed: otherwise, $|\rho|_{d/(d+p)}$ could
be infinite. Even when $\rho$ is in $L^{d/(d+p)}$, there is the case where it is
supported on a sequence of small balls going to infinity. Then the location
of the balls is important in the quality of approximation and not only the
profile of the density function. However, this hypothesis could probably
be relaxed to a moment condition.

Theorem 1.2 has no real analog for measures of fractional dimension.

Theorem 1.3. -- There is an $s$-dimensional Ahlfors regular measure $\kappa$
on $\mathbb{R}$ (namely, $\kappa$ is the Cantor dyadic measure) such that $W_p(\kappa, \Delta_N)\,N^{1/s}$
has no limit.

Section 6 is devoted to this example.

Part of the interest of Theorem 1.2 comes from the following observation,
to be discussed in Section 7: when $p = 2$, the support of a distance
minimizing $\mu_N \in \Delta_N$ generates a centroidal Voronoi tessellation, that
is, each point is the center of mass (with respect to $\mu$) of its Voronoi
cell. We thus get the asymptotic distribution of an important family of
centroidal Voronoi tessellations, which enables us to prove some sort of
energy equidistribution principle.

1.2. Discussion of previously known results. — There are several 
previous works closely related to the content of this paper. 

1.2.1. Foundations of Quantization for Probability Distributions. -- The
book [10] by Graf and Luschgy (see also the references therein), which we
only discovered recently, contains many results on the present problem.
Theorem 1.1 is proved there in Section 12, but our proof seems more
direct. Theorem 1.2 is proved in the Euclidean case in Sections 6 and 7
(with a weakening of the compact support assumption). A generalization
of Theorem 1.3 is proved in Section 14, yet we present a proof for the
sake of self-containedness.

The case $p = 1$, $M = \mathbb{R}^n$ is usually called the location problem. In
this setting, Theorem 1.2 has also been proved by Bouchitté, Jimenez
and Rajesh [1] under the additional assumption that $\rho$ is lower
semi-continuous.

Our main motivation to publish this work despite these overlaps is
that the case of measures on manifolds should find applications; for
example, good approximations of the curvature measure of a convex body
by discrete measures should give good approximations of the body by
polyhedra.

It seems that quantization, the location problem and the study
of optimal CVTs, although the last two are particular cases of the first
one, have only been studied independently. We hope that noticing this
proximity will encourage progress on each question to be translated to
the others.

1.2.2. Around the main theorem. -- Mosconi and Tilli in [15] have studied
(for any exponent $p$, in $\mathbb{R}^n$) the irrigation problem, where the
approximating measures are supported on connected sets of length at most $\ell$ (the
length being the 1-dimensional Hausdorff measure) instead of being
supported on $N$ points; the order of approximation is then $\ell^{-1/(n-1)}$.

Brancolini, Buttazzo, Santambrogio and Stepanov compare in [2] the
location problem with its "short-term planning" version, where the
support $\operatorname{supp}\mu_N$ is constructed by adding one point to that of $\mu_{N-1}$,
minimizing the cost only locally in $N$.

1.2.3. Approximation constants for cubes. -- Some values of $\theta(d,p)$ have
been determined. First, it is easy to compute them in dimension 1:

\[ \theta(1,p) = \frac{(p+1)^{-1/p}}{2}. \]
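
As a sanity check of the one-dimensional constant $\theta(1,p) = (p+1)^{-1/p}/2$, the following Python sketch evaluates the transport cost between the uniform measure on $[0,1]$ and the measure putting mass $1/N$ at the midpoint of each of $N$ equal subintervals. The function names and the midpoint quadrature are illustrative assumptions; the sketch takes for granted that this equal-cell configuration realizes the constant and does not prove optimality.

```python
def quantization_cost_1d(n_points: int, p: float, grid: int = 2000) -> float:
    """W_p between the uniform measure on [0, 1] and the measure with mass
    1/N at the midpoint of each of N equal subintervals."""
    half_width = 0.5 / n_points
    step = 2.0 * half_width / grid
    # Midpoint rule for the integral of |t|^p over one cell [-half_width, half_width].
    cell_integral = sum(
        abs(-half_width + (j + 0.5) * step) ** p for j in range(grid)
    ) * step
    # All N cells contribute the same cost; take the p-th root at the end.
    return (n_points * cell_integral) ** (1.0 / p)


def theta_1d(p: float) -> float:
    """Closed form (p + 1)^(-1/p) / 2 for the one-dimensional constant."""
    return (p + 1.0) ** (-1.0 / p) / 2.0
```

With these definitions, `n * quantization_cost_1d(n, p)` should agree with `theta_1d(p)` up to quadrature error, for every `n`.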

The case $d = 2$ has been solved by Fejes Tóth [8], [9] (and by Newman
[16] for $p = 2$ and Morgan and Bolton [14] for $p = 1$), see also [10],
Section 8. In particular

\[ \theta(2,2) = 5\sqrt{3}/54, \qquad \theta(2,1) = 2^{-2/3}\,3^{-7/4}\,(4 + \ln 27). \]

When $d = 2$ and for all $p$, the hexagonal lattice is optimal (that is,
the given $\theta$ is the distance between the uniform measure on a regular
hexagon and a Dirac mass at its center). All other cases are open to our
knowledge. For numerical evidence in the case $p = 2$, $d = 3$ see Du and
Wang [6]. Note that in the limit case $p = \infty$, determining $\theta$ amounts
to determining the minimal density of a ball covering of $\mathbb{R}^d$, which is
arguably as challenging as determining the maximal density of a ball
packing, a well-known open problem if $d > 3$.
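
The value $5\sqrt{3}/54$ quoted for $p = 2$ can be recovered as the second moment $\int_H |x|^2\,dx$ of a regular hexagon $H$ of unit area about its center. The Python sketch below computes such moments by a polar-coordinate quadrature; the function name and the quadrature rule are assumptions made for illustration, and the sketch only recomputes this moment, without re-deriving how it is normalized into $\theta$.

```python
import math


def hexagon_moment(p: float, steps: int = 2000) -> float:
    """Integral of |x|^p over a regular hexagon of unit area centred at 0."""
    # Apothem h of a unit-area regular hexagon: area = 2 * sqrt(3) * h^2.
    h = math.sqrt(1.0 / (2.0 * math.sqrt(3.0)))
    # By symmetry the hexagon splits into 12 right triangles; in polar
    # coordinates each one is {0 <= angle <= pi/6, 0 <= r <= h / cos(angle)}.
    width = (math.pi / 6.0) / steps
    total = 0.0
    for j in range(steps):
        angle = (j + 0.5) * width
        radius = h / math.cos(angle)
        # Radial integral of r^p * r dr from 0 to radius.
        total += radius ** (p + 2.0) / (p + 2.0) * width
    return 12.0 * total
```

Calling `hexagon_moment(0.0)` returns the area, which serves as a consistency check of the parametrization.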

1.2.4. Random variations. -- Concerning the order of convergence, it
is worth comparing with the problem of estimating the distance from a
measure $\mu$ to empirical measures $\bar\mu_N = N^{-1}\sum_k \delta_{X_k}$ where $X_1, \dots, X_N$
are independent random variables of law $\mu$. It seems that $\bar\mu_N$ is almost
optimal in the sense that $W_2(\mu, \bar\mu_N) \sim C\,N^{-1/d}$ almost surely (under
moment conditions, but here we take $\mu$ compactly supported so this is not an
issue); Horowitz and Karandikar have shown in [12] that $W_2(\mu, \bar\mu_N)$ has
the order at most $N^{-1/(d+4)}$ and the better exponent above is suggested
in the Mathematical Review of that paper.

Let us also briefly note that the optimal matching problem for random
data is related to our problem. Simply put, one can say that if $\bar\mu'_N$
is another empirical measure of $\mu$, then the Wasserstein distance between
$\bar\mu_N$ and $\bar\mu'_N$ also has the order
$N^{-1/d}$ if $d \geq 3$ (see for example Dobrić and Yukich [3]). In the same
flavour, other optimisation problems for random data have been studied
(minimal length covering tree, traveling salesperson problem, bipartite
versions of those, etc.)

1.2.5. Centroidal Voronoi Tessellations. -- In the case $p = 2$, the problem
is linked to (optimal) centroidal Voronoi tessellations, see Section 7
and [5]. In that paper (Section 6.4.1), the principle of energy
equidistribution is given in the 1-dimensional case for smooth density $\rho$. Our
Corollary 7.1 in the last section generalizes this to non-regular densities,
all exponents, and all dimensions; it is however quite a direct consequence
of Theorem 1.2.

1.3. Related open questions. -- The number $N$ of points of the
support may be the first measure of complexity of a finitely supported
measure that one comes up with, but it is not necessarily the most
relevant. Concerning the problem of numerical analysis of transportation
problems, numbers are usually encoded in a computer as floating-point
numbers. One could therefore define the complexity of a measure supported
on points of decimal coordinates, with decimal quantity of mass at each
point, as the memory size needed to describe it, and seek to minimize
the distance to a given $\mu$ among measures of given complexity.
Another possible notion of complexity is entropy: one defines

\[ h\Big(\sum_i m_i \delta_{x_i}\Big) = -\sum_i m_i \ln(m_i). \]

A natural question is to search for a $\mu_h$ that minimizes the distance to $\mu$
among the finitely supported measures of entropy at most $h$, and to
study the behavior of $\mu_h$ when we let $h \to \infty$.
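
To make the entropy notion concrete, here is a minimal Python sketch of $h = -\sum_i m_i \ln(m_i)$ for the list of masses of a finitely supported measure. The function name is an assumption, and zero masses are skipped using the usual convention $0 \ln 0 = 0$. For the uniform probability measure on $N$ points the value is $\ln N$, so an entropy budget $h$ plays a role comparable to a support of size $e^h$ in that special case.

```python
import math


def entropy(masses):
    """Entropy -sum m_i ln(m_i) of the masses of a finitely supported measure."""
    # Atoms of zero mass do not contribute (convention 0 * ln 0 = 0).
    return -sum(m * math.log(m) for m in masses if m > 0.0)
```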

Acknowledgements. — I am grateful to Romain Joly, Vincent Mun- 
nier, Herve Pajot, Remy Peyre and Cedric Villani for interesting discus- 
sions or comments. 
2. Recalls and definitions

2.1. Notations. -- Given two sequences $(u_n)$, $(v_n)$ of non-negative real
numbers, we shall write:

- $u_n \lesssim v_n$ to mean that there exists a positive real $a$ and an integer
$N$ such that $u_n \leq a v_n$ for all $n \geq N$,

- $u_n \approx v_n$ to mean $u_n \lesssim v_n$ and $u_n \gtrsim v_n$.

From now on, M is a given Riemannian manifold of dimension d. By a 
domain of M we mean a compact domain with piecewise smooth bound- 
ary (and possibly corners) and finitely many connected components. 

2.2. Ahlfors regularity and a covering result. -- We denote by
$B(x,r)$ the closed ball of radius $r$ and center $x$; sometimes, when $B =
B(x,r)$ and $k \in \mathbb{R}$, we denote by $kB$ the ball $B(x, kr)$.

Let $\mu$ be a finite, compactly supported measure on a manifold $M$ of
dimension $d$, and let $s \in (0, +\infty)$. One says that $\mu$ is Ahlfors regular of
dimension $s$ if there is a constant $C$ such that for all $x \in \operatorname{supp}\mu$ and for
all $r \leq \operatorname{diam}(\operatorname{supp}\mu)$, one has

\[ C^{-1} r^s \leq \mu(B(x,r)) \leq C r^s. \]

This is a strong condition, but it is satisfied for example by self-similar
measures, see [13], [7] for definitions and Section 6 for the most famous
example of the Cantor measure.

Note that if $\mu$ is Ahlfors regular of dimension $s$, then $s$ is the Hausdorff
dimension of $\operatorname{supp}\mu$ (and therefore $s \leq d$), see [11, Sec. 8.7].

We shall need the following classical covering result. 

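Proposition 2.1 ($5\delta$ covering). -- If $X$ is a closed set and $\mathcal{F}$ is a
family of balls of uniformly bounded diameter such that $X \subset \bigcup_{B \in \mathcal{F}} B$, then
there is a subfamily $\mathcal{G}$ of $\mathcal{F}$ such that:

- $X \subset \bigcup_{B \in \mathcal{G}} 5B$,

- $B \cap B' = \emptyset$ whenever $B \neq B' \in \mathcal{G}$.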
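
In the special case of a finite point cloud and balls of one common radius $\delta$, a subfamily with both properties can be selected greedily. The Python sketch below is only an illustration of that special case (the function name and the finite setting are assumptions); it does not cover families with varying radii, for which the proposition is actually needed, and in this simple setting the enlarged balls of radius $2\delta$ already cover, so the factor 5 is not sharp.

```python
import math


def greedy_net(points, delta):
    """Greedily select centres that are pairwise more than 2 * delta apart.

    The closed balls of radius delta around the selected centres are then
    pairwise disjoint, and every input point lies within 2 * delta of a
    centre (otherwise it would have been selected itself).
    """
    centres = []
    for q in points:
        if all(math.dist(q, c) > 2.0 * delta for c in centres):
            centres.append(q)
    return centres
```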

2.3. Wasserstein distances. -- Here we recall some basic facts on
optimal transportation and Wasserstein distances. For more information,
the reader may consult, for example, Villani's book [17], which
provides a very good introduction to this topic.

First consider the case $p < \infty$, which shall attract most of our
attention. A finite measure $\mu$ on $M$ is said to have finite $p$-th moment if for
some (hence all) $x_0 \in M$ the following holds:

\[ \int_M d(x_0, x)^p \,\mu(dx) < +\infty. \]

In particular, any compactly supported finite measure has finite $p$-th
moment for all $p$.

Let $\mu_0, \mu_1$ be two finite measures having finite $p$-th moment and the
same mass. A transport plan between $\mu_0$ and $\mu_1$ is a measure $\Pi$ on
$M \times M$ that has $\mu_0$ and $\mu_1$ as marginals, that is: $\Pi(A \times M) = \mu_0(A)$
and $\Pi(M \times A) = \mu_1(A)$ for all Borel sets $A$. One shall think of $\Pi$ as
an assignment of mass: $\Pi(A \times B)$ represents the mass sent from $A$ to $B$.

The $L^p$ cost of a transport plan is defined as

\[ c_p(\Pi) = \int_{M \times M} d(x,y)^p \,\Pi(dx\,dy). \]

One defines the $L^p$ Wasserstein distance by

\[ W_p(\mu_0, \mu_1) = \Big(\inf_\Pi c_p(\Pi)\Big)^{1/p} \]

where the infimum is over all transport plans between $\mu_0$ and $\mu_1$. One can
show that there is always a transport plan that achieves this infimum,
and that $W_p$ defines a distance on the set of measures with finite $p$-th
moment and given mass.

Moreover, if $M$ is compact $W_p$ metrizes the weak topology. If $M$ is
non-compact, it defines a finer topology.

Most of the time, one restricts oneself to probability measures. Here,
we shall use extensively mass transportation between submeasures of the
main measures under study, so that we need to consider measures of
arbitrary mass. Given positive measures $\mu$ and $\nu$, we write that $\mu \leq \nu$
if $\mu(A) \leq \nu(A)$ for all Borel sets $A$, which means that $\nu - \mu$ is also a
positive measure.

It is important to notice that $c_p(\Pi)$ is homogeneous of degree 1 in the
total mass and of degree $p$ on distances, so that in the case $M = \mathbb{R}^d$, if $\varphi$ is
a similitude of ratio $r$, we have $W_p(m\,\varphi_\#\mu_0, m\,\varphi_\#\mu_1) = m^{1/p}\, r\, W_p(\mu_0, \mu_1)$.

The case $p = \infty$ is obtained as a limit of the finite case, see [3]. Let $\mu_0$
and $\mu_1$ be compactly supported measures of the same mass and let $\Pi$ be
a transport plan between $\mu_0$ and $\mu_1$. The $L^\infty$ length of $\Pi$ is defined as

\[ \ell_\infty(\Pi) = \sup\{ d(x,y) \mid (x,y) \in \operatorname{supp}\Pi \} \]

BENOIT KLOECKNER 



that is, the maximal distance moved by some infinitesimal amount of
mass when applying $\Pi$. The $L^\infty$ distance between $\mu_0$ and $\mu_1$ then is

\[ W_\infty(\mu_0, \mu_1) = \inf_\Pi \ell_\infty(\Pi) \]

where the infimum is over all transport plans from $\mu_0$ to $\mu_1$. In a sense, the
$L^\infty$ distance is a generalisation to measures of the Hausdorff metric on
compact sets. We shall use $\ell_\infty$, but not $W_\infty$. The problem of minimizing
$W_\infty(\mu, \Delta_N)$ is a matter of covering $\operatorname{supp}\mu$ (independently of $\mu$ itself), a
problem with quite a different flavour from ours.
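
For finitely supported measures on the real line, the distance $W_p$ with $p < \infty$ and its scaling rule $W_p(m\,\varphi_\#\mu_0, m\,\varphi_\#\mu_1) = m^{1/p}\, r\, W_p(\mu_0,\mu_1)$ can be checked directly. The Python sketch below is restricted to dimension one, where the monotone (sorted) coupling is optimal for $p \geq 1$; the function name, the `(position, mass)` encoding and the numerical tolerance are assumptions made for illustration.

```python
def wasserstein_1d(atoms0, atoms1, p):
    """W_p between two finitely supported measures on the real line.

    Each argument is a list of (position, mass) pairs with equal total mass.
    Mass is matched in increasing order of position (monotone coupling).
    """
    a = sorted(atoms0)
    b = sorted(atoms1)
    i = j = 0
    rem_a, rem_b = a[0][1], b[0][1]
    cost = 0.0
    while True:
        # Move as much mass as possible between the two current atoms.
        moved = min(rem_a, rem_b)
        cost += moved * abs(a[i][0] - b[j][0]) ** p
        rem_a -= moved
        rem_b -= moved
        if rem_a <= 1e-15:
            i += 1
            if i == len(a):
                break
            rem_a = a[i][1]
        if rem_b <= 1e-15:
            j += 1
            if j == len(b):
                break
            rem_b = b[j][1]
    return cost ** (1.0 / p)
```

For instance, with $\mu_0 = \tfrac12\delta_0 + \tfrac12\delta_1$ and $\mu_1 = \delta_{1/2}$ every bit of mass travels a distance $1/2$, so the distance is $1/2$ for every $p$; multiplying masses by $m = 4$ and applying $\varphi(x) = 3x + 1$ multiplies $W_2$ by $4^{1/2} \cdot 3 = 6$.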

3. Preparatory results 

The following lemmas are useful tools we shall need; the first two at
least cannot claim any kind of originality by themselves.

Lemma 3.1 (monotony). -- Let $\mu$ and $\nu$ be finite measures of equal
mass and $\tilde\mu \leq \mu$. Then there is a measure $\tilde\nu \leq \nu$ (in particular, $\operatorname{supp}\tilde\nu \subset
\operatorname{supp}\nu$) such that

\[ W_p(\tilde\mu, \tilde\nu) \leq W_p(\mu, \nu). \]

Proof. -- Let $\Pi$ be an optimal transportation plan from $\mu$ to $\nu$. We
construct a low-cost transportation plan from $\tilde\mu$ to $\tilde\nu$ by disintegrating
$\Pi$.

There is a family of finite measures $(\eta_x)_{x \in M}$ such that $\Pi = \int \eta_x\,\mu(dx)$,
that is

\[ \Pi(A \times B) = \int_A \eta_x(B)\,\mu(dx) \]

for all Borel sets $A$ and $B$. Define

\[ \tilde\Pi(A \times B) = \int_A \eta_x(B)\,\tilde\mu(dx) \]

and let $\tilde\nu$ be the second factor projection of $\tilde\Pi$. Since $\tilde\Pi \leq \Pi$, we have
$\tilde\nu \leq \nu$ and $c_p(\tilde\Pi) \leq c_p(\Pi)$; moreover $\tilde\Pi$ is a transport plan from $\tilde\mu$ to $\tilde\nu$ by
definition of $\tilde\nu$. □

Lemma 3.2 (summing). -- Let $(\mu, \nu)$ and $(\tilde\mu, \tilde\nu)$ be finite measures
with pairwise equal masses. Then

\[ W_p^p(\mu + \tilde\mu, \nu + \tilde\nu) \leq W_p^p(\mu, \nu) + W_p^p(\tilde\mu, \tilde\nu). \]

Proof. -- Let $\Pi$ and $\tilde\Pi$ be optimal transport plans between respectively
$\mu$ and $\nu$, $\tilde\mu$ and $\tilde\nu$. Then $\Pi + \tilde\Pi$ is a transport plan between $\mu + \tilde\mu$ and
$\nu + \tilde\nu$ whose cost is $c_p(\Pi + \tilde\Pi) = c_p(\Pi) + c_p(\tilde\Pi)$. □

These very simple results have a particularly important consequence
concerning our question.

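Lemma 3.3 ($L^1$ stability). -- Let $\mu$ and $\tilde\mu$ be finite compactly
supported measures on $M$, $\varepsilon \in (0,1)$ and $(\mu_N)$ be any sequence of
$N$-supported measures.

There is a sequence of $N$-supported measures $\tilde\mu_N$ such that there are
at most $\varepsilon N$ points in $\operatorname{supp}\tilde\mu_N \setminus \operatorname{supp}\mu_{N_1}$ and

\[ W_p^p(\tilde\mu, \tilde\mu_N) \leq W_p^p(\mu, \mu_{N_1}) + O\!\left(\frac{|\mu - \tilde\mu|_{VT}}{(\varepsilon N)^{p/d}}\right) \]

where $N_1$ is equivalent to (and at least) $(1 - \varepsilon)N$, $|\cdot|_{VT}$ is the total
variation norm and the constant in the $O$ depends only on the geometry
of a domain where both $\mu$ and $\tilde\mu$ are concentrated.
In particular we get

\[ W_p^p(\tilde\mu, \Delta_N) \leq W_p^p(\mu, \Delta_{(1-\varepsilon)N}) + O\!\left(\frac{|\mu - \tilde\mu|_{VT}}{(\varepsilon N)^{p/d}}\right). \]

The name of this result has been chosen to emphasize that the total
variation distance between two absolutely continuous measures is half
the $L^1$ distance between their densities.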

Proof. -- We can write $\tilde\mu = \mu' + \nu$ where $\mu' \leq \mu$ and $\nu$ is a positive
measure of total mass at most $|\mu - \tilde\mu|_{VT}$. If $D$ is a domain supporting
both $\mu$ and $\tilde\mu$, it is a classical fact that there is a constant $C$ (depending
only on $D$) such that for all integers $K$, there are points $x_1, \dots, x_{K^d} \in D$
such that each point of $D$ is at distance at most $C/K$ from one of the
$x_i$. For example if $D$ is a Euclidean cube of side length $L$, by dividing it
regularly one can achieve $C = L\sqrt{d}/2$.

Take $K = \lfloor (\varepsilon N)^{1/d} \rfloor$; then by sending each point of $D$ to a closest $x_i$,
one constructs a transport plan between $\nu$ and a $K^d$-supported measure
$\nu_N$ whose cost is at most $|\mu - \tilde\mu|_{VT}\,(C/K)^p$.

Let $N_1 = N - K^d$. The monotony lemma gives a measure $\mu'_N \leq \mu_{N_1}$
(in particular, $\mu'_N$ is $N_1$-supported) such that

\[ W_p(\mu', \mu'_N) \leq W_p(\mu, \mu_{N_1}). \]

The summing lemma now shows that

\[ W_p^p(\tilde\mu, \mu'_N + \nu_N) \leq W_p^p(\mu, \mu_{N_1}) + \frac{C^p\,|\mu - \tilde\mu|_{VT}}{K^p}. \]

□
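
The Euclidean cube example from the proof can be made explicit: dividing $[0, L]^d$ into $K^d$ equal subcubes and taking their centers gives $K^d$ points such that every point of the cube is within $L\sqrt{d}/(2K)$ of one of them, the worst case being a corner of a subcube. The Python sketch below builds this grid; the function names are illustrative assumptions.

```python
import itertools
import math


def grid_centres(side, dim, k):
    """Centres of the k**dim equal subcubes of the cube [0, side]**dim."""
    ticks = [(j + 0.5) * side / k for j in range(k)]
    return list(itertools.product(ticks, repeat=dim))


def covering_radius_bound(side, dim, k):
    """The bound C / K with C = L * sqrt(d) / 2 for a cube of side length L."""
    return side * math.sqrt(dim) / (2.0 * k)
```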



Note that the presence of the $|\mu - \tilde\mu|_{VT}$ factor will be crucial in the sequel,
but would not be present in the limit case $p = \infty$, which is therefore very
different.

Lemma 3.4 (metric stability). -- Assume $D$ is a compact domain of
$M$, endowed with two different Riemannian metrics $g$ and $g'$ (defined on
an open neighborhood of $D$). Denote by $|g' - g|$ the minimum number $r$
such that

\[ e^{-2r} g_x(v,v) \leq g'_x(v,v) \leq e^{2r} g_x(v,v) \]

for all $x \in D$ and all $v \in T_xM$.

Then, denoting by $W_p$ the Wasserstein metric computed using the
distance $d$ induced by $g$, and by $W'_p$ the one obtained from the distance $d'$
induced by $g'$, one has for all measures $\mu, \nu$ supported on $D$ and of equal
mass:

\[ e^{-|g'-g|}\, W_p(\mu,\nu) \leq W'_p(\mu,\nu) \leq e^{|g'-g|}\, W_p(\mu,\nu). \]

Proof. -- For all $x, y \in D$ one has $d'(x,y) \leq e^r d(x,y)$ by computing
the $g'$-length of a $g$-minimizing (or almost minimizing to avoid regularity
issues on the boundary) curve connecting $x$ to $y$. The same reasoning
applies to transport plans: if $\Pi$ is optimal from $\mu$ to $\nu$ according to
$d$, then the $d'$-cost of $\Pi$ is at most $e^{pr}$ times the $d$-cost of $\Pi$, so that
$W'_p(\mu, \nu) \leq e^r W_p(\mu, \nu)$. The other inequality follows by symmetry. □

Let us end with a result showing that no mass is moved very far away
by an optimal transport plan to an $N$-supported measure if $N$ is large
enough.

Lemma 3.5 (localization). -- Let $\mu$ be a compactly supported finite
measure. If $\mu_N$ is a closest $N$-supported measure to $\mu$ in $L^p$ Wasserstein
distance and $\Pi_N$ is an $L^p$ optimal transport plan between $\mu$ and $\mu_N$, then
when $N$ goes to $\infty$,

\[ \ell_\infty(\Pi_N) \to 0. \]

Proof. -- Assume on the contrary that there are sequences $N_k \to \infty$,
$x_k \in \operatorname{supp}\mu$ and a number $\varepsilon > 0$ such that $\Pi_{N_k}$ moves $x_k$ by a distance
at least $\varepsilon$. There is a covering of $\operatorname{supp}\mu$ by a finite number of balls
of radius $\varepsilon/3$. Up to extracting a subsequence, we can assume that all
$x_k$ lie in one of these balls, denoted by $B$. Since $B$ is a neighborhood
of $x_k$ and $x_k \in \operatorname{supp}\mu$, we have $\mu(B) > 0$. Since $\Pi_{N_k}$ is optimal, it
moves $x_k$ to a closest point in $\operatorname{supp}\mu_{N_k}$, which must be at distance at
least $\varepsilon$ from $x_k$. Therefore, every point in $B$ is at distance at least $\varepsilon/3$
from $\operatorname{supp}\mu_{N_k}$, so that $c_p(\Pi_{N_k}) \geq \mu(B)(\varepsilon/3)^p > 0$, in contradiction with
$W_p(\mu, \Delta_N) \to 0$. □



4. Approximation rate and dimension 



Theorem 1.1 is the union of the two following propositions. Note that
the estimates given do not depend much on $p$, so that in fact Theorem
1.1 stays true when $p = \infty$.

Proposition 4.1. -- If $\mu$ is a compactly supported probability measure
on $M$ and if for some $C > 0$ and for all $r \leq \operatorname{diam}(\operatorname{supp}\mu)$, one has

\[ C^{-1} r^s \leq \mu(B(x,r)) \]

then for all $N$

\[ W_p(\mu, \Delta_N) \leq \frac{5\,C^{1/s}}{N^{1/s}}. \]

Proof. -- The $5\delta$ covering proposition above implies that given any $\delta >
0$, there is a subset $\mathcal{X}$ of $\operatorname{supp}\mu$ such that

- $\operatorname{supp}\mu \subset \bigcup_{x \in \mathcal{X}} B(x, 5\delta)$,

- $B(x, \delta) \cap B(x', \delta) = \emptyset$ whenever $x \neq x' \in \mathcal{X}$.

In particular, as soon as $\delta < \operatorname{diam}(\operatorname{supp}\mu)$ one has

\[ 1 \geq \sum_{x \in \mathcal{X}} \mu(B(x,\delta)) \geq |\mathcal{X}|\, C^{-1} \delta^s \]

so that $\mathcal{X}$ is finite, with $|\mathcal{X}| \leq C \delta^{-s}$.

Let $\tilde\mu$ be a measure supported on $\mathcal{X}$, that minimizes the $L^p$ distance
to $\mu$ among those. A way to construct $\tilde\mu$ is to assign to a point $x \in \mathcal{X}$
a mass equal to the $\mu$-measure of its Voronoi cell, that is of the set of
points nearer to $x$ than to any other point in $\mathcal{X}$. The mass at a point
at equal distance from several elements of $\mathcal{X}$ can be split indifferently
between those. The previous discussion also gives a transport plan from
$\mu$ to $\tilde\mu$, where each bit of mass moves a distance at most $5\delta$, so that
$W_p(\mu, \tilde\mu) \leq 5\delta$ (whatever $p$).

Now, let $N$ be a positive integer and choose $\delta = (C/N)^{1/s}$. The family
$\mathcal{X}$ obtained from that $\delta$ has at most $N$ elements, so that $W_p(\mu, \Delta_N) \leq
5(C/N)^{1/s}$. □

Proposition 4.2. -- If $\mu$ is a probability measure on $M$ and if for some
$C > 0$ and for all $r$, one has

\[ \mu(B(x,r)) \leq C r^s \]

then for all $N$,

\[ W_p(\mu, \Delta_N) \geq \left(\frac{s}{s+p}\right)^{1/p} (CN)^{-1/s}. \]

Proof. -- Consider a measure $\mu_N \in \Delta_N$ that minimizes the distance to
$\mu$. For all $\delta > 0$, the union of the balls centered at $\operatorname{supp}\mu_N$ and of radius
$\delta$ has $\mu$-measure at most $NC\delta^s$. In any transport plan from $\mu$ to $\mu_N$,
a quantity of mass at least $1 - NC\delta^s$ travels a distance at least $\delta$, so
that in the best case the quantity of mass traveling a distance between
$\delta \leq (NC)^{-1/s}$ and $\delta + d\delta$ is $NCs\,\delta^{s-1}\,d\delta$. It follows that

\[ W_p(\mu, \mu_N)^p \geq \int_0^{(NC)^{-1/s}} sNC\,\delta^{s-1}\,\delta^p \,d\delta \]

so that $W_p(\mu, \mu_N) \geq (s/(s+p))^{1/p}\,(NC)^{-1/s}$. □
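
The last step of this proof is the elementary identity $\int_0^{R} sNC\,\delta^{s-1}\delta^p\,d\delta = \frac{s}{s+p}(NC)^{-p/s}$ with $R = (NC)^{-1/s}$, which uses $NC\,R^s = 1$. It can be checked numerically with the Python sketch below; the function names and the midpoint quadrature are assumptions made for illustration.

```python
def lower_bound_integral(n, c, s, p, steps=20000):
    """Midpoint rule for the integral of s*N*C*delta^(s-1)*delta^p on [0, R],
    where R = (N*C)^(-1/s)."""
    upper = (n * c) ** (-1.0 / s)
    width = upper / steps
    total = 0.0
    for j in range(steps):
        delta = (j + 0.5) * width
        total += s * n * c * delta ** (s - 1.0 + p) * width
    return total


def lower_bound_closed_form(n, c, s, p):
    """Closed form s / (s + p) * (N*C)^(-p/s) of the same integral."""
    return s / (s + p) * (n * c) ** (-p / s)
```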

In fact, Theorem 1.1 applies to more general measures, for example
combinations of Ahlfors regular ones, thanks to the following.

Lemma 4.3. -- If $\mu = a_1\mu^1 + a_2\mu^2$ where $a_i > 0$ and $\mu^i$ are
probability measures such that $W_p(\mu^2, \Delta_N) \lesssim W_p(\mu^1, \Delta_N)$ and $W_p(\mu^1, \Delta_N) \lesssim
W_p(\mu^1, \Delta_{2N})$ then $W_p(\mu, \Delta_N) \approx W_p(\mu^1, \Delta_N)$.

Proof. -- By the monotony lemma, $W_p(a_1\mu^1, \Delta_N) \leq W_p(\mu, \Delta_N)$ so that
$W_p(\mu^1, \Delta_N) \leq a_1^{-1/p}\, W_p(\mu, \Delta_N)$.
The summing lemma gives

\[ W_p(\mu, \Delta_{2N}) \leq \big( W_p^p(a_1\mu^1, \Delta_N) + W_p^p(a_2\mu^2, \Delta_N) \big)^{1/p} \]

so that

\[ W_p(\mu, \Delta_{2N}) \lesssim W_p(a_1\mu^1, \Delta_N) \lesssim W_p(\mu^1, \Delta_{2N}). \]

Since $W_p(\mu, \Delta_{2N+1}) \leq W_p(\mu, \Delta_{2N})$ we also get

\[ W_p(\mu, \Delta_{2N+1}) \lesssim W_p(\mu^1, \Delta_{2N}) \lesssim W_p(\mu^1, \Delta_{4N}) \leq W_p(\mu^1, \Delta_{2N+1}). \]

□
The following is now an easy consequence of this lemma.

Corollary 4.4. -- Assume that $\mu = \sum_i a_i \mu^i$ where $a_i > 0$ and $\mu^i$ are
probability measures that are compactly supported and Ahlfors regular of
dimension $s_i > 0$. Let $s = \max_i(s_i)$. Then

\[ W_p(\mu, \Delta_N) \approx \frac{1}{N^{1/s}}. \]


5. Absolutely continuous measures 

In this section we prove Theorem 1.2. To prove the Euclidean case,
the idea is to approximate (in the $L^1$ sense) a measure with density by
a combination of uniform measures on cubes. Then a measure on a
manifold can be decomposed as a combination of measures supported
in charts, and by metric stability the problem reduces to the Euclidean
case.

The following key lemma shall be used several times to extend the 
class of measures for which we have precise approximation estimates. 

Lemma 5.1 (Combination). -- Let $\mu$ be an absolutely continuous measure
on $M$. Let $D_i$ ($1 \leq i \leq I$) be domains of $M$ whose interiors do not
overlap, and assume we can decompose $\mu = \sum_{i=1}^I \mu^i$ where $\mu^i$ is
non-zero and supported on $D_i$. Assume moreover that there are numbers
$(\alpha_1, \dots, \alpha_I) = \alpha$ such that

\[ W_p(\mu^i, \Delta_N) \sim \frac{\alpha_i}{N^{1/d}}. \]

Let $\mu_N \in \Delta_N$ be a sequence minimizing the $W_p$ distance to $\mu$ and define
$N_i$ (with implicit dependency on $N$) as the number of points of $\operatorname{supp}\mu_N$
that lie on $D_i$, the points lying on a common boundary of two or more
domains being attributed arbitrarily to one of them.

If the vector $(N_i/N)_i$ has a cluster point $x = (x_i)$, then $x$ minimizes

\[ F(\alpha; x) = \left( \sum_i \frac{\alpha_i^p}{x_i^{p/d}} \right)^{1/p} \]

and if $(N_i/N)_i \to x$ when $N \to \infty$, then

\[ W_p(\mu, \mu_N) \sim \frac{F(\alpha; x)}{N^{1/d}}. \]

Note that the assumption that none of the $\mu^i$ vanish is obviously
unnecessary (but convenient). If some of the $\mu^i$ vanish, one only has to dismiss
them.

Proof. -- For simplicity we denote $c_p(N) = W_p^p(\mu, \Delta_N)$. Let $\varepsilon$ be any
positive, small enough number.

We can find a $\delta > 0$ and domains $D'_i \subset D_i$ such that: each point of $D'_i$
is at distance at least $\delta$ from the complement of $D_i$; if $\mu'^i$ is the restriction
of $\mu^i$ to $D'_i$, $|\mu^i - \mu'^i|_{VT} \leq \varepsilon^{1+p/d}$.

Assume $x$ is the limit of $(N_i/N)_i$ when $N \to \infty$. Let us first prove
that none of the $x_i$ vanishes. Assume the contrary for some index $i$:
then $N_i = o(N)$. For each $N$ choose an optimal transport plan $\Pi_N$
from $\mu$ to $\mu_N$. Let $\nu_N \leq \mu^i$ be the part of $\mu^i$ that is sent by $\Pi_N$ to
the $N_i$ points of $\operatorname{supp}\mu_N$ that lie in $D_i$, constructed as in the summing
lemma, and let $m_N = \mu^i(D'_i) - \nu_N(D'_i)$ be the mass that moves from $D'_i$
to the exterior of $D_i$ under $\Pi_N$. Then the cost of $\Pi_N$ is bounded from
below by $m_N \delta^p + W_p^p(\nu_N, \Delta_{N_i})$. Since it goes to zero, we have $m_N \to 0$
and up to extracting a subsequence $\nu_N \to \nu$ where $\mu'^i \leq \nu \leq \mu^i$. The
cost of $\Pi_N$ is therefore bounded from below by any number less than
$W_p^p(\nu, \Delta_{N_i}) \gtrsim N_i^{-p/d} \gg N^{-p/d}$, a contradiction.

Now, let $\varepsilon$ be any positive, small enough number. By considering
optimal transport plans between $\mu^i$ and optimal $N_i$-supported measures
of $D_i$, we get that

\[ c_p(N) \leq \sum_i W_p^p(\mu^i, \Delta_{N_i}) \leq \sum_i \frac{(\alpha_i + \varepsilon)^p}{(x_i N)^{p/d}} \]

when all $N_i$ are large enough, which happens if $N$ itself is large enough
given that $x_i \neq 0$.

For $N$ large enough, the localization lemma ensures that no mass is
moved more than $\delta$ by an optimal transport plan between $\mu$ and $\mu_N$.
This implies that the cost $c_p(N)$ is bounded below by $\sum_i W_p^p(\mu'^i, \Delta_{N_i})$.
By $L^1$ stability this gives the bound

\[ c_p(N) \geq \sum_i \frac{\alpha_i^p (1 - \varepsilon)^{p/d} + O(\varepsilon)}{(x_i N)^{p/d}}. \]

The two inequalities above give us

\[ c_p(N) \sim \frac{F(\alpha; x)^p}{N^{p/d}}. \]

Now, if $x$ is a mere cluster point of $(N_i/N)$, this still holds up to a
subsequence. If $x$ did not minimize $F(\alpha; \cdot)$, then by taking best
approximations of $\mu^i$ supported on $x'_i N$ points where $x'$ is a minimizer, we would
get by the same computation a sequence $\mu'_N$ with better asymptotic
behavior than $\mu_N$ (note that we used the optimality of $\mu_N$ only to bound
from above each $W_p^p(\mu^i, \Delta_{N_i})$). □

The study of the functional $F$ is straightforward.

Lemma 5.2. -- Fix a positive vector $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_I)$ and consider
the simplex $X = \{(x_i)_i \mid \sum_i x_i = 1,\ x_i > 0\}$. The function $F(\alpha; \cdot)$ has
a unique minimizer $x^0 = (x_i^0)$ in $X$, which is proportional to $(\alpha_i^{dp/(d+p)})_i$,
with

\[ F(\alpha; x^0) = \Big( \sum_i \alpha_i^{dp/(d+p)} \Big)^{(d+p)/(dp)} =: |\alpha|_{dp/(d+p)}. \]

As a consequence, in the combination lemma the vector $(N_i/N)$ must
converge to $x^0$.

Proof. -- First $F(\alpha; \cdot)$ is continuous and goes to $\infty$ on the boundary of
$X$, so that it must have a minimizer. Any minimizer must be a critical
point of $F^p$ and therefore satisfy

\[ \sum_i \alpha_i^p\, x_i^{-p/d-1}\, \eta_i = 0 \]

for all vectors $(\eta_i)$ such that $\sum_i \eta_i = 0$. This holds only when $\alpha_i^p x_i^{-p/d-1}$
is a constant and we get the uniqueness of $x^0$ and its expression:

\[ x_i^0 = \frac{\alpha_i^{dp/(d+p)}}{\sum_j \alpha_j^{dp/(d+p)}}. \]

The value of $F(\alpha; x^0)$ follows.

In the combination lemma, we know by compactness that $(N_i/N)$ must
have cluster points, all of which must minimize $F(\alpha; \cdot)$. Since there is
only one minimizer, $(N_i/N)$ has only one cluster point and must converge
to $x^0$. □
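
The closed forms for the minimizer and the minimum of $F(\alpha; \cdot)$ on the simplex are easy to test numerically. The Python sketch below implements $F$, the candidate minimizer proportional to $\alpha_i^{dp/(d+p)}$, and the value $|\alpha|_{dp/(d+p)}$; the function names are illustrative assumptions.

```python
def F(alpha, x, p, d):
    """F(alpha; x) = (sum_i alpha_i^p / x_i^(p/d))^(1/p) for x in the open simplex."""
    return sum(a ** p / xi ** (p / d) for a, xi in zip(alpha, x)) ** (1.0 / p)


def minimiser(alpha, p, d):
    """Point of the simplex proportional to alpha_i^(d*p/(d+p))."""
    exponent = d * p / (d + p)
    weights = [a ** exponent for a in alpha]
    total = sum(weights)
    return [w / total for w in weights]


def minimum_value(alpha, p, d):
    """The quasi-norm |alpha|_{dp/(d+p)}, i.e. the minimum of F(alpha; .)."""
    exponent = d * p / (d + p)
    return sum(a ** exponent for a in alpha) ** (1.0 / exponent)
```

Evaluating `F` at `minimiser(alpha, p, d)` should return `minimum_value(alpha, p, d)`, and any other point of the simplex should give a value at least as large.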
Figure 1. An optimal $N$-supported measure can be used to
construct a good $k^d N$-supported measure for all $k$.

We are now ready to tackle more and more cases in Theorem 1.2. As
a starting point, we consider the uniform measure $\square^d$ on the unit cube
of $\mathbb{R}^d$ (endowed with the canonical metric).



Proposition 5.3. -- There is a number $\theta(d,p) > 0$ such that

\[ W_p(\square^d, \Delta_N) \sim \frac{\theta(d,p)}{N^{1/d}}. \]



The proof is obviously not new, since it is the same argument that 
shows that an optimal packing (or covering) of the Euclidean space must 
have a well-defined density (its upper and lower densities are equal). 

Proof. -- Let $c(N) = W_p^p(\square^d, \Delta_N)$. We already know that $c(N) \approx
N^{-p/d}$, so let $A = \liminf N^{p/d} c(N)$ and consider any $\varepsilon > 0$.

Let $N_1$ be an integer such that $c(N_1) \leq (A + \varepsilon) N_1^{-p/d}$ and let $\mu_1 \in \Delta_{N_1}$
be nearest to $\square^d$. For any integer $\ell$, we can write $\ell = k^d + q$ where
$k = \lfloor \ell^{1/d} \rfloor$ and $q$ is an integer; then $q = O(\ell^{1-1/d}) = o(\ell)$ where the $o$
depends only on $d$.

Divide the cube into $k^d$ cubes of side length $1/k$, and consider the
element $\mu_k$ of $\Delta_{k^d N_1}$ obtained by duplicating $\mu_1$ in each of the cubes,
with scaling factor $k^{-1}$ and mass factor $k^{-d}$ (see Figure 1). The obvious
transport plan obtained in the same way from the optimal one between
$\square^d$ and $\mu_1$ has total cost $k^{-p} c(N_1)$, so that

\[ c(\ell N_1) \leq k^{-p} c(N_1) \leq (A + \varepsilon) \left( \frac{\ell}{k^d} \right)^{p/d} \frac{1}{(\ell N_1)^{p/d}}. \]

But since $k^d \sim \ell$, for $\ell$ large enough we get

\[ c(\ell N_1) \leq \frac{A + 2\varepsilon}{(\ell N_1)^{p/d}}. \]

Now $N \sim \lfloor N/N_1 \rfloor N_1$ so that for $N$ large enough $c(N) \leq (A + 3\varepsilon) N^{-p/d}$.
This proves that $\limsup N^{p/d} c(N) \leq A + 3\varepsilon$ for all $\varepsilon > 0$. □
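
The decomposition $\ell = k^d + q$ with $k = \lfloor \ell^{1/d} \rfloor$ used in this proof, together with the smallness of the remainder $q$, can be illustrated as follows. The Python sketch is an assumption-laden illustration (the function name is hypothetical), and the explicit bound $q \leq d\,2^{d-1}\,\ell^{1-1/d}$ checked in the usage note is just one convenient way of quantifying $q = O(\ell^{1-1/d})$.

```python
def split_count(ell, d):
    """Write ell = k**d + q with k = floor(ell**(1/d)) and q >= 0."""
    k = int(round(ell ** (1.0 / d)))
    # Correct floating-point rounding so that k**d <= ell < (k + 1)**d.
    while k ** d > ell:
        k -= 1
    while (k + 1) ** d <= ell:
        k += 1
    return k, ell - k ** d
```

Since $q < (k+1)^d - k^d \leq d\,(k+1)^{d-1} \leq d\,(2k)^{d-1}$ and $k \leq \ell^{1/d}$, the remainder satisfies $q \leq d\,2^{d-1}\,\ell^{1-1/d}$ for every $\ell \geq 1$.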

Note that we used the self-similarity of the cube at many different scales;
the result does not hold with more general self-similar (fractal) measures,
see Section 6.

Now, the combination lemma enables us to extend the validity domain
of Equation (1).

Lemma 5.4. -- Let $\mu = \rho\lambda$ where $\lambda$ is the Lebesgue measure on $\mathbb{R}^d$, $\rho$
is an $L^1$ non-negative function supported on a union of cubes $C_i$ with
non-overlapping interiors, side length $\delta$, and assume $\rho$ is constant on each
cube, with value $\rho_i$. Then Equation (1) holds.

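Proof. -- Let $\mu^i$ be the restriction of $\mu$ to $C_i$, removing any cube where $\rho$
vanishes identically. Then from Proposition 5.3 we get $W_p(\mu^i, \Delta_N) \sim
\alpha_i N^{-1/d}$ where

\[ \alpha_i = \theta(d,p)\,(\rho_i \delta^d)^{1/p}\,\delta = \theta(d,p)\,\rho_i^{1/p}\,\delta^{(d+p)/p} \]

due to the homogeneity of $W_p$: $\mu^i$ is obtained from $\square^d$ by multiplication
by $\rho_i \delta^d$ and dilation of a factor $\delta$. By the combination lemma, we get
$W_p(\mu, \Delta_N) \sim \min F(\alpha; \cdot)\, N^{-1/d}$ where

\[ \min F(\alpha; \cdot) = \theta(d,p) \Big( \sum_i \rho_i^{d/(d+p)}\, \delta^d \Big)^{(d+p)/(dp)} = \theta(d,p)\, |\rho|_{d/(d+p)}^{1/p}. \]

□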
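
The algebra in this proof, namely that the minimum of $F(\alpha; \cdot)$ for $\alpha_i = \theta\,\rho_i^{1/p}\,\delta^{(d+p)/p}$ equals $\theta\,|\rho|_{d/(d+p)}^{1/p}$, can be verified numerically for a piecewise constant density. In the Python sketch below the function names are hypothetical and `theta` is a free placeholder parameter, since the true value of $\theta(d,p)$ is not needed for the identity.

```python
def quasi_norm(rho_values, cell_volume, beta):
    """|rho|_beta = (integral of rho^beta)^(1/beta) for a density that is
    constant on cells of equal volume."""
    return (sum(r ** beta for r in rho_values) * cell_volume) ** (1.0 / beta)


def combined_constant(rho_values, side, p, d, theta=1.0):
    """min F(alpha; .) for alpha_i = theta * rho_i^(1/p) * side^((d+p)/p)."""
    alphas = [theta * r ** (1.0 / p) * side ** ((d + p) / p) for r in rho_values]
    exponent = d * p / (d + p)
    # The minimum of F(alpha; .) is the quasi-norm |alpha|_{dp/(d+p)}.
    return sum(a ** exponent for a in alphas) ** (1.0 / exponent)
```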

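Lemma 5.5. -- Equation (1) holds whenever $\mu$ is an absolutely continuous
measure defined on a compact domain of $\mathbb{R}^d$.

Proof. -- For simplicity, we denote $\beta = d/(d + p)$. Let $C$ be a cube
containing the support of $\mu$. Choose some $\varepsilon > 0$. Let $\tilde\mu = \tilde\rho\lambda$ be a
measure such that $\tilde\rho$ is constant on each cube of a regular subdivision of $C$,
is zero outside $C$, satisfies $|\rho - \tilde\rho|_1 \leq 2\varepsilon^{1+p/d}$ and such that $|\rho - \tilde\rho|_\beta \leq \varepsilon|\rho|_\beta$.
The stability lemma shows that

\[ W_p^p(\mu, \Delta_N) \leq W_p^p(\tilde\mu, \Delta_{(1-\varepsilon)N}) + O\!\left( \frac{|\rho - \tilde\rho|_1}{2(\varepsilon N)^{p/d}} \right) \]

so that, using the hypotheses on $\tilde\rho$ and the previous lemma,

\[ W_p^p(\mu, \Delta_N) \leq \frac{(\theta(d,p) + \varepsilon)^p\, |\rho|_\beta\, (1 + \varepsilon)(1 - \varepsilon)^{-p/d} + O(\varepsilon)}{N^{p/d}} \]

for $N$ large enough.

Symmetrically, we get (again for $N$ large enough) a lower bound of the
same form, with denominator $N^{p/d}$ and a numerator that also tends to
$\theta(d,p)^p\,|\rho|_\beta$ when $\varepsilon \to 0$.

Letting $\varepsilon \to 0$, the claimed equivalent follows.

□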



Lemma 5.6. - - Equation §Q) holds whenever p is an absolutely con- 
tinuous measure defined on a compact domain of~R d , endowed with any 
Riemannian metric. 

Proof. — Denote by $g$ the Riemannian metric, and let $C$ be a Euclidean
cube containing the support of $\mu$. Let $\varepsilon$ be any positive number,
and choose a regular subdivision of $C$ into cubes $C_i$ of center $p_i$ such
that for all $i$, the restriction $g_i$ of $g$ to $C_i$ is almost constant:
$|g(p) - g(p_i)| \le \varepsilon/2$ for all $p \in C_i$. Denote by $\tilde g$
the piecewise constant metric with value $g(p_i)$ on $C_i$. Note that even if
$\tilde g$ is not continuous, at each discontinuity point $x$ the possible
choices for the metric are within a factor $e^{2\varepsilon}$ one from
another, and one defines $\tilde g(x)(v,v)$ as the least of the
$g(p_i)(v,v)$ over all $i$ such that $x \in C_i$. In this way, $\tilde g$
defines a distance function close to the distance induced by $g$ and the
metric stability lemma holds with the same proof.

If one prefers not to use discontinuous metrics, then it is also possible
to consider slightly smaller cubes $C'_i \subset C_i$, endow $C'_i$ with a
constant metric, and interpolate the metric between the various cubes. Then
one uses the $L^1$ stability in addition to the metric stability in the sequel.

Denote by $\rho$ the density of $\mu$ with respect to the volume form defined
by $\tilde g$, by $\mu^i$ the restriction of $\mu$ to $C_i$ and by $\rho_i$
the density of $\mu^i$. A domain of $\mathbb{R}^d$ endowed with a constant
metric is isometric to a domain of $\mathbb{R}^d$ with the Euclidean metric so
that we can apply the preceding lemma to each $\mu^i$: denoting by $W'_p$ the
Wasserstein distance computed from the metric $\tilde g$,

\[ W'_p(\mu^i, \Delta_N) \sim \theta(d,p)\,\frac{|\rho_i|_{d/(d+p)}^{1/p}}{N^{1/d}}. \]

The combination lemma then ensures that
$W'_p(\mu, \Delta_N) \sim \min F(\alpha;\cdot)\,N^{-1/d}$ where

\[ \min F(\alpha;\cdot) = \theta(d,p)\Big(\sum_i \int \rho_i^{d/(d+p)}\Big)^{(d+p)/(dp)} = \theta(d,p)\,|\rho|_{d/(d+p)}^{1/p}. \]

The metric stability lemma gives

\[ e^{-\varepsilon}\,W'_p(\mu, \Delta_N) \le W_p(\mu, \Delta_N) \le e^{\varepsilon}\,W'_p(\mu, \Delta_N) \]

and we only have left to let $\varepsilon \to 0$. □

We can finally end the proof of the main theorem.

Proof of the main theorem. — Here $\mu$ is an absolutely continuous measure
defined on a compact domain $D$ of $M$. Divide the domain into a finite
number of subdomains $D_i$, each of which is contained in a chart. Using
this chart, each $D_i$ is identified with a domain of $\mathbb{R}^d$ (endowed
with the pulled-back metric of $M$). By combination, the previous lemma shows
that Equation (1) holds.

Let us now give the asymptotic distribution of the support of any
distance minimizing $\mu_N$. Let $A$ be any domain in $M$. Let $x$ be the
limit of the proportion of $\operatorname{supp}\mu_N$ that lies inside $A$
($x$ exists up to extracting a subsequence). Since domains generate the Borel
$\sigma$-algebra, we only have to prove that
$x = \int_A \rho^{d/(d+p)} / \int_M \rho^{d/(d+p)}$. But this follows
from the combination lemma applied to the restriction of $\mu$ to $A$ and to
its complement. □



6. The dyadic Cantor measure 

In this section we study the approximation problem for the dyadic
Cantor measure $\kappa$ to prove Theorem 1.3.

Let $S^0$, $S^1$ be the dilations of ratio $1/3$ with fixed points $0$ and $1$
respectively. The map defined by

\[ \mathscr{S} : \mu \mapsto \tfrac12\,S^0_\#\mu + \tfrac12\,S^1_\#\mu \]

is $1/3$-Lipschitz on the complete metric space of probability measures
having finite $p$-th moment endowed with the $L^p$ Wasserstein metric. It
has therefore a unique fixed point, called the dyadic Cantor measure and
denoted by $\kappa$. It can be considered as the "uniform" measure on the
usual Cantor set.

By convexity of the cost function and symmetry, $c_1 := W_p(\kappa, \Delta_1)$
is realized by the Dirac measure at $1/2$. Using the contractivity of
$\mathscr{S}$, we see at once that $W_p(\kappa, \Delta_{2^k}) \le 3^{-k}c_1$.
Denote by $s = \log 2/\log 3$ the dimension of $\kappa$. We have

\[ W_p(\kappa, \Delta_{2^k})\,(2^k)^{1/s} \le c_1 \]

for all integers $k$.

To study the case when the number of points is not a power of $2$, and
to get lower bounds in all cases, we introduce a notation to code the
regions of $\operatorname{supp}\kappa$. Let $I^0 = [0,1]$ and given a word
$w = \epsilon_n \dots \epsilon_1$ where $\epsilon_i \in \{0,1\}$, define
$I^n_w = S^{\epsilon_n} S^{\epsilon_{n-1}} \cdots S^{\epsilon_1}[0,1]$. The
soul of such an interval is the open interval of one-third length with the
same center. The sons of $I^n_w$ are the two intervals $I^{n+1}_{w\epsilon}$
where $\epsilon \in \{0,1\}$, and an interval is the father of its sons. The
two sons of an interval are brothers. Finally, we say that $n$ is the
generation of the interval $I^n_w$.

Let $N$ be an integer, and $\mu_N \in \Delta_N$ be a measure closest to
$\kappa$, whose support is denoted by $\{x_1, \dots, x_N\}$. An interval
$I^n_w$ is said to be terminal if there is an $x_i$ in its soul. A point in
$I^n_w$ is always closer to the center of $I^n_w$ than to the center of its
father. This and the optimality of $\mu_N$ imply that a terminal interval
contains only one $x_i$, at its center.

Since the restriction of $\kappa$ to $I^n_w$ is a copy of $\kappa$ with mass
$2^{-n}$ and size $3^{-n}$, it follows that

\[ W_p(\kappa, \mu_N)^p = c_1^p \sum_{I^n_w} 2^{-n}\,3^{-np} \]

where the sum is on terminal intervals. A simple convexity argument
shows that the terminal intervals are of at most two (successive)
generations.

Consider the numbers $N_k = 3 \cdot 2^k$. The terminal intervals of an optimal
$\mu_{N_k}$ must be in generations $k+1$ (for $2^k$ of them) and $k+2$ (for
$2^{k+1}$ of them). Therefore

\[ W_p(\kappa, \mu_{N_k})^p = c_1^p\,\big(3^{-(k+1)p} + 3^{-(k+2)p}\big)/2 \]

and finally

\[ W_p(\kappa, \Delta_{N_k})\,N_k^{1/s} = c_1 \Big(\frac{1+3^{-p}}{2}\Big)^{1/p}\,3^{\frac{\log 3}{\log 2}-1}. \]
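For the reader's convenience, the arithmetic behind this last identity can be spelled out. It only uses $1/s = \log 3/\log 2$, so that $(2^k)^{1/s} = 3^k$, together with the factorization $3^{-(k+1)p} + 3^{-(k+2)p} = 3^{-(k+1)p}(1 + 3^{-p})$:

```latex
\begin{align*}
W_p(\kappa,\Delta_{N_k})
  &= c_1\, 3^{-(k+1)} \Big(\frac{1+3^{-p}}{2}\Big)^{1/p}, \\
N_k^{1/s} &= 3^{1/s}\,(2^k)^{1/s} = 3^{1/s}\, 3^{k}, \\
W_p(\kappa,\Delta_{N_k})\, N_k^{1/s}
  &= c_1 \Big(\frac{1+3^{-p}}{2}\Big)^{1/p} 3^{\frac{1}{s} - 1}.
\end{align*}
```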

Note that the precise distribution of the support does not have any
importance (see Figure 2).

Figure 2. The first four steps of the construction of the Cantor
set; the Cantor measure is equally divided between the
intervals of a given step. The bullets show the supports of two
optimal approximations of $\kappa$ by 6-supported measures. We see
that there is no need for the support to be equally distributed
between the intervals of the first generation.



To see that $W_p(\kappa, \Delta_N)\,N^{1/s}$ has no limit, it is now
sufficient to estimate the factor of $c_1$ in the right-hand side of the
above formula. First we remark that $\big(\frac{1+3^{-p}}{2}\big)^{1/p}$ is
greater than $1 - (1 - 3^{-p})/(2p)$, which is increasing in $p$ and takes
for $p = 1$ the value $2/3$. Finally, we compute

\[ \tfrac23 \cdot 3^{\frac{\log 3}{\log 2}-1} \simeq 1.27 > 1. \]
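These values are easy to reproduce. The sketch below evaluates $W_p(\kappa,\mu_N)\,N^{1/s}/c_1$ from the terminal-interval formula of this section, for the configuration behind the upper bound at $N = 2^k$ and for the configuration described at $N_k = 3\cdot 2^k$. The function names and the dictionary encoding (generation mapped to number of terminal intervals) are illustrative choices, not notation from the paper.

```python
import math

S_DIM = math.log(2) / math.log(3)  # dimension s of the dyadic Cantor measure


def normalized_cost(terminal_counts, p):
    """Return W_p(kappa, mu_N) * N^(1/s) / c_1 for a configuration having
    terminal_counts[n] terminal intervals of generation n."""
    n_points = sum(terminal_counts.values())
    mass = sum(m * 2.0 ** (-n) for n, m in terminal_counts.items())
    assert abs(mass - 1.0) < 1e-12, "terminal intervals must carry total mass 1"
    # Sum over terminal intervals of 2^(-n) * 3^(-n p), as in the text.
    cost_p = sum(m * 2.0 ** (-n) * 3.0 ** (-n * p)
                 for n, m in terminal_counts.items())
    return cost_p ** (1 / p) * n_points ** (1 / S_DIM)


def power_of_two(k, p):
    # N = 2^k: every terminal interval is in generation k.
    return normalized_cost({k: 2 ** k}, p)


def three_times_power_of_two(k, p):
    # N_k = 3 * 2^k: 2^k intervals in generation k+1, 2^(k+1) in generation k+2.
    return normalized_cost({k + 1: 2 ** k, k + 2: 2 ** (k + 1)}, p)


if __name__ == "__main__":
    for p in (1, 2, 4):
        print(p,
              [round(power_of_two(k, p), 6) for k in (1, 5)],
              [round(three_times_power_of_two(k, p), 6) for k in (1, 5)])
    # The lower bound computed in the text, attained for p = 1.
    print(2 / 3 * 3 ** (1 / S_DIM - 1))
```

The normalized value stays at $1$ along $N = 2^k$ for this configuration and at a constant larger than $1.26$ along $N = 3\cdot 2^k$, independently of $k$, which is the oscillation the argument relies on.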

Note that the fundamental property of $\kappa$ we used is that the points in
a given $I^n_w$ are closer to its center than to that of its father. The same
method can therefore be used to study the approximation of sparser
Cantor measures, or of some higher-dimensional analogue like the one
generated by four contractions of ratio $1/4$ on the plane, centered at the
four vertices of a square.

Moreover, one could study in more detail the variations in the
approximations $W_p(\kappa, \Delta_N)$. As said before, here our point was
only to show the limitations to Theorem 1.2.



7. Link with Centroidal Voronoi Tessellations 

Here we explain the link between our optimization problem and the
centroidal Voronoi tessellations (CVTs in short). For a complete account
of CVTs, the reader can consult [5] from where all definitions below are
taken. Since we use the concept of barycenter, we consider only the case
$M = \mathbb{R}^d$ (with the Euclidean metric). As before, $\lambda$ denotes
the Lebesgue measure.

7.1. A quick presentation. — Consider a compact convex domain $\Omega$
in $\mathbb{R}^d$ and a density (positive, $L^1$) function $\rho$ on $\Omega$.

Given an $N$-tuple $X = (x_1, \dots, x_N)$ of so-called generating points, one
defines the associated Voronoi tessellation as the collection of convex
sets

\[ V_i = \{ x \in \Omega \mid |x - x_i| \le |x - x_j| \text{ for all } j \in \{1, \dots, N\} \} \]

and we denote it by $V(X)$. One says that $V_i$ is the Voronoi cell of $x_i$.
It is a tiling of $\Omega$, in particular the cells cover $\Omega$ and have
disjoint interiors. Each $V_i$ has a center of mass, equivalently defined as

\[ g_i = \frac{\int_{V_i} x\,\rho(x)\,dx}{\int_{V_i} \rho(x)\,dx} \]

or as the minimizer of the energy functional

\[ \mathcal{E}_{V_i}(g) = \int_{V_i} |x - g|^2\,\rho(x)\,dx. \]

One says that $(V_i)_i$ is a centroidal Voronoi tessellation or CVT, if for
all $i$, $g_i = x_i$. The existence of CVTs comes easily by considering the
following optimization problem: search for an $N$-tuple of points
$X = (x_1, \dots, x_N)$ and a tiling $V$ of $\Omega$ by $N$ sets
$V_1, \dots, V_N$ which together minimize

\[ \mathcal{E}_V(X) = \sum_{i=1}^{N} \mathcal{E}_{V_i}(x_i). \]

A compactness argument shows that such a minimizer exists, so let us
explain why a minimizer must be a CVT together with its generating
set. First, each $x_i$ must be the center of mass of $V_i$, otherwise one
could reduce the total energy by moving $x_i$ to $g_i$ and changing nothing
else. But also, $V_i$ should be the Voronoi cell of $x_i$, otherwise there is
a $j \ne i$ and a set of positive measure in $V_i$ whose points are closer to
$x_j$ than to $x_i$. Transferring this set from $V_i$ to $V_j$ would reduce
the total cost.
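The two improvement steps of this argument (move each point to the center of mass of its set, then reassign every point of the domain to its nearest generator) are exactly the two half-steps of Lloyd's algorithm, a standard heuristic for computing CVTs that the text itself does not discuss. The following is a minimal sketch on a weighted sample of $[0,1]$ standing in for $\rho\lambda$; the function name, the sample size and the starting points are illustrative assumptions.

```python
def lloyd_1d(points, weights, generators, iterations=150):
    """Lloyd's algorithm on a weighted 1D sample (illustrative sketch).

    Returns the final generators and the energy recorded at each iteration.
    """
    gens = sorted(generators)
    energies = []
    for _ in range(iterations):
        # Voronoi step: assign each sample point to its nearest generator.
        cells = [[] for _ in gens]
        for x, w in zip(points, weights):
            i = min(range(len(gens)), key=lambda k: abs(x - gens[k]))
            cells[i].append((x, w))
        # Energy of the current pair (generators, Voronoi cells), exponent 2.
        energies.append(sum(w * (x - g) ** 2
                            for g, cell in zip(gens, cells) for x, w in cell))
        # Centroid step: move each generator to the center of mass of its
        # cell (an empty cell keeps its generator unchanged).
        gens = [sum(w * x for x, w in cell) / sum(w for _, w in cell)
                if cell else g
                for g, cell in zip(gens, cells)]
    return gens, energies


if __name__ == "__main__":
    m = 1000  # sample size approximating the uniform density on [0, 1]
    points = [(j + 0.5) / m for j in range(m)]
    weights = [1.0 / m] * m
    gens, energies = lloyd_1d(points, weights, [0.1, 0.2, 0.3, 0.9])
    print(gens)                      # close to 1/8, 3/8, 5/8, 7/8
    print(energies[0], energies[-1])  # the energy never increases
```

Each half-step can only lower the energy, which mirrors the reasoning above; Lloyd's algorithm is only guaranteed to reach a CVT, not necessarily an optimal one, although for the uniform density on an interval the two coincide.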

We observe that this optimization problem is exactly that of approximating
the measure $\rho\lambda$ in $L^2$ Wasserstein distance; more precisely,
finding the $N$-tuple $X$ that minimizes $\inf_V \mathcal{E}_V(X)$ is
equivalent to finding the support of an optimal $\mu_N \in \Delta_N$ closest
to $\rho\lambda$, and then the Voronoi tessellation generated by $X$ gives the
mass of $\mu_N$ at each $x_i$ and the optimal transport from $\rho\lambda$ to
$\mu_N$.

One says that a CVT is optimal when its generating set is a global
minimizer of the energy functional

\[ \mathcal{E}(X) = \mathcal{E}_{V(X)}(X). \]

Optimal CVTs are most important in applications, which include for
example mesh generation and image analysis (see [5]).



7.2. Equidistribution of Energy. — The principle of energy
equidistribution says that if $X$ generates an optimal CVT, the energies
$\mathcal{E}_{V_i}(x_i)$ of the generating points should be asymptotically
independent of $i$ when $N$ goes to $\infty$.

Our goal here is to deduce a mesoscopic version of this principle from
Theorem 1.2. A similar result holds for any exponent, so that we introduce
the $L^p$ energy functionals
$\mathcal{E}^p_{V_i}(x_i) = \int_{V_i} |x - x_i|^p \rho(x)\,dx$,
$\mathcal{E}^p_V(X) = \sum_i \mathcal{E}^p_{V_i}(x_i)$ and
$\mathcal{E}^p(X) = \inf_V \mathcal{E}^p_V(X) = \mathcal{E}^p_{V(X)}(X)$.
In particular, an optimal $X$ for this last functional is the support of an
element of $\Delta_N$ minimizing the $L^p$ Wasserstein distance to
$\rho\lambda$.

Note that for $p \ne 2$ an $X$ minimizing $\mathcal{E}^p(X)$ need not generate
a CVT, since the minimizer of $\mathcal{E}^p_{V_i}$ is not always the center
of mass of $V_i$ (but it is unique as soon as $p > 1$).

Corollary 7.1. - - Let $A$ be a cube of $\Omega$. Let
$X^N = \{x^N_1, \dots, x^N_N\}$ be a sequence of $N$-sets minimizing
$\mathcal{E}^p$ for the density $\rho$, and denote by
$\bar{\mathcal{E}}_A(N)$ the average energy of the points of $X^N$ that lie
in $A$. Then

\[ N^{(d+p)/d}\,\bar{\mathcal{E}}_A(N) \]

has a limit when $N \to \infty$, and this limit does not depend on $A$.

The cube $A$ could be replaced by any domain, but not by any open
set. Since the union of the $X^N$ is countable, there are indeed open sets
of arbitrarily small measure containing all the points $(x^N_i)_{N,i}$.

Proof. — Fix some $\varepsilon > 0$ and denote by $A' \subset A$ the set of
points that are at distance at least $\varepsilon$ from $\Omega \setminus A$
and by $A'' \supset A$ the set of points at distance at most $\varepsilon$
from $A$.

First, the numbers $N'$, $N''$ of points of $X^N$ in $A'$, $A''$ satisfy

\[ N' \sim N\,\frac{\int_{A'} \rho^{d/(d+p)}}{\int_\Omega \rho^{d/(d+p)}}, \qquad N'' \sim N\,\frac{\int_{A''} \rho^{d/(d+p)}}{\int_\Omega \rho^{d/(d+p)}}. \]

The localization lemma implies that the maximal distance by which
mass is moved by the optimal transport between $\rho\lambda$ and the optimal
$X^N$-supported measure tends to $0$, so that for $N$ large enough the energy
of all points in $A$ is at least the minimal cost between
$\rho_{|A'}\lambda$ and $\Delta_{N'}$ and at most the minimal cost between
$\rho_{|A''}\lambda$ and $\Delta_{N''}$.

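Letting $\varepsilon \to 0$ we thus get that the total energy of all points
of $X^N$ lying in $A$ is equivalent to

\[ \theta(d,p)^p \int_A \rho^{d/(d+p)}\,\Big(\int_\Omega \rho^{d/(d+p)}\Big)^{p/d} N^{-p/d}. \]

As a consequence, dividing by the number of points of $X^N$ in $A$, we have
$\bar{\mathcal{E}}_A(N) \sim \theta(d,p)^p \big(\int_\Omega \rho^{d/(d+p)}\big)^{(d+p)/d} N^{-(d+p)/d}$. □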
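As a sanity check in the simplest case (a worked example that is not part of the original argument), take $d = 1$, $\Omega = [0,1]$ and $\rho \equiv 1$, and assume the classical fact that the optimal $N$-set then consists of the midpoints $x_i = (2i-1)/(2N)$ of $N$ equal intervals. Every cell has the same energy, so the average over the points lying in any cube $A$ is

```latex
\begin{align*}
\bar{\mathcal{E}}_A(N)
  &= \int_{-1/(2N)}^{1/(2N)} |t|^p \, dt
   = \frac{2}{p+1}\Big(\frac{1}{2N}\Big)^{p+1}
   = \frac{2^{-p}}{p+1}\, N^{-(p+1)}, \\
N^{(d+p)/d}\, \bar{\mathcal{E}}_A(N) &= \frac{2^{-p}}{p+1}
  \quad \text{for } d = 1,
\end{align*}
```

which indeed depends neither on $A$ nor on $N$; for $p = 2$ the constant is $1/12$.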